Mining Actionable Subspace Clusters in Sequential Data
Abstract
Extracting knowledge from data and using it for decision making is vital in various real-world problems, particularly in the financial domain. We identify several financial problems that require mining actionable subspaces defined by objects and attributes over a sequence of time. These subspaces are actionable in the sense that they can suggest profitable actions to decision-makers. We propose to mine actionable subspace clusters from sequential data, which are subspaces with high and correlated utilities. To mine them efficiently, we propose the framework MASC (Mining Actionable Subspace Clusters), a hybrid of numerical optimization, principal component analysis and frequent itemset mining. We conduct a wide range of experiments to demonstrate the actionability of the clusters and the robustness of MASC. We show that our clustering results are not sensitive to the framework parameters and that full recovery of embedded clusters in synthetic data is possible. In our case study, we show that clusters with higher utilities correspond to higher actionability, and that our clusters can be used to outperform one of the most famous value investment strategies.
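As a toy illustration of what "high and correlated utilities" could mean for a candidate subspace, the sketch below scores a set of objects and attributes over time. It is not the MASC algorithm: the helper `score_subspace`, the attribute names and all numbers are hypothetical, and the three measures (value spread, mean utility, minimum pairwise utility correlation) are assumptions chosen only to make the idea concrete.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (neither constant)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def score_subspace(values, utility, objects, attributes):
    """Score a candidate subspace (hypothetical helper, not part of MASC).

    values[o][a][t] is attribute a of object o at timestamp t.
    utility[o][t] is the utility (e.g. profit) of object o at timestamp t.
    Needs at least two objects. Returns (spread, mean_utility, min_corr).
    """
    n_times = len(utility[objects[0]])
    # Largest gap between the objects' values over the chosen attributes and
    # all timestamps: a small gap means the objects behave alike.
    spread = max(
        max(values[o][a][t] for o in objects)
        - min(values[o][a][t] for o in objects)
        for a in attributes
        for t in range(n_times)
    )
    # Average utility over all chosen objects and timestamps.
    mean_utility = sum(sum(utility[o]) for o in objects) / (len(objects) * n_times)
    # Weakest pairwise correlation between the objects' utility series.
    min_corr = min(
        pearson(utility[p], utility[q]) for p, q in combinations(objects, 2)
    )
    return spread, mean_utility, min_corr


# Made-up data: three stocks, two attributes, three timestamps.
values = {
    "A": {"pe": [10, 11, 12], "debt": [0.5, 0.5, 0.4]},
    "B": {"pe": [10, 12, 12], "debt": [0.9, 0.1, 0.4]},
    "C": {"pe": [30, 5, 20], "debt": [0.5, 0.5, 0.5]},
}
utility = {"A": [1.0, 2.0, 3.0], "B": [2.0, 4.0, 6.0], "C": [3.0, 1.0, 2.0]}

spread, mean_utility, min_corr = score_subspace(values, utility, ["A", "B"], ["pe"])
wide_spread, _, _ = score_subspace(values, utility, ["A", "B", "C"], ["pe"])
```

In this made-up data, stocks A and B stay close on `pe` at every timestamp and their utilities rise together, so the pair scores as a tighter candidate than the set that also includes C.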
Similar resources
An Efficient Actionable 3D Subspace Clustering Based on Optimal Centroids
An efficient actionable 3D subspace clustering method based on optimal centroids, for continuous-valued data represented three-dimensionally, which is suitable for real-world problems such as profitable stock discovery and identifying biologically significant protein residues. It yields actionable patterns and incorporates domain knowledge, which allows users to choose the preferred utility (profit/benefit) function, ...
Integration of Subspace Clustering and Action Detection on Financial Data
Object, attribute and context information are linked in the dimensional data models. Cluster quality is decided with domain knowledge and parameter setting requirements. CAT Seeker is a centroid-based actionable 3D subspace clustering framework. The CAT Seeker framework is used to find profitable actions. Singular value decomposition, numerical optimization and 3D frequent itemset mining methods are i...
Exploring Constraints Inconsistence for Value Decomposition and Dimension Selection Using Subspace Clustering
Datasets in object-attribute-time form are referred to as three-dimensional (3D) datasets. Because 3D datasets contain many timestamps, they are very difficult to cluster, so a subspace clustering method is applied to them. Existing algorithms are inadequate to solve this clustering problem. Most of them are not actionable (ability to suggest profitable or benef...
Less is More: Non-Redundant Subspace Clustering
Clustering is an important data mining task for grouping similar objects. In high dimensional data, however, effects attributed to the “curse of dimensionality” render clustering meaningless. Due to this, recent years have seen research on subspace clustering, which searches for clusters in relevant subspace projections of high dimensional data. As the number of possibl...
Select actionable positive or negative sequential patterns
Negative sequential patterns (NSP) refer to sequences with non-occurring and occurring items, and can play an irreplaceable role in understanding and addressing many business applications. However, some problems occur after mining NSP, the most urgent one of which is how to select the actionable positive or negative sequential patterns. This is due to the following factors: 1) positive sequenti...